Exploring the cobia (Rachycentron canadum) genome: unveiling putative male heterogametic regions and identification of sex-specific markers

Abstract Background Cobia (Rachycentron canadum) is the only member of the Rachycentridae family and exhibits considerable sexual dimorphism in growth rate. Sex determination in teleosts has been a long-standing basic biological question, and the molecular mechanisms of sex determination/differentiation in cobia are completely unknown. Results Here, we reported 2 high-quality, chromosome-level annotated male and female cobia genomes with assembly sizes of 586.51 Mb (contig/scaffold N50: 86.0 kb/24.3 Mb) and 583.88 Mb (79.9 kb/22.5 Mb), respectively. Synteny inference among perciform genomes revealed that cobia and the remora Echeneis naucrates were sister groups. Further, whole-genome resequencing of 31 males and 60 females, genome-wide association study, and sequencing depth analysis identified 3 short male-specific regions within a 10.7-kb continuous genomic region on male chromosome 18, which hinted at an undifferentiated sex chromosome system with a putative XX/XY mode of sex determination in cobia. Importantly, the only 2 genes within/between the male-specific regions, epoxide hydrolase 1 (ephx1, renamed cephx1y) and transcription factor 24 (tcf24, renamed ctcf24y), showed testis-specific/biased gene expression, whereas their counterparts cephx1x and ctf24x, located in female chromosome 18, were similarly expressed in both sexes. In addition, male-specific PCR targeting the cephx1y gene revealed that this genomic feature is conserved in cobia populations from Panama, Brazil, Australia, and Japan. Conclusion The first comprehensive genomic survey presented here is a valuable resource for future studies on cobia population structure and dynamics, conservation, and evolutionary history. Furthermore, it establishes evidence of putative male heterogametic regions with 2 genes playing a potential role in the sex determination of the species, and it provides further support for the rapid evolution of sex-determining mechanisms in teleost fish.

hardly to identify merely by genomic data.Again, it is dangerous to jump to the conclusion of sex determination system (XY or ZW) using genomic data without any population experiment validation.I would like to suggestion author should weaken their interpretation to draw such a conclusion.Reply: We thank the reviewer for the valuable feedback.We recognize the crucial importance of precision in interpreting sex chromosomes identification and understanding sex determination systems.We acknowledge the inherent challenges in defining sex chromosomes only through genomic data.Given the focus of this current work on identifying sex-associated regions and sex-related markers, we have tempered our interpretation regarding sex chromosome identification and male heterogametic sex determination system (XX/XY) for cobia (Lines 49,58,222,400,409,510).Furthermore, we have revised the title to: "Exploring the cobia (Rachycentron canadum) genome: Unveiling putative male heterogametic regions and identification of sex-specific markers".Also, the manuscript contains some minor issues such as inconsistent tense usage, occasional awkward phrasing, and punctuation errors.These do not detract significantly from the overall quality of the work but should be addressed in a minor revision.
Here are some examples, but I suggested that author should address issues through the whole manuscript: Inconsistent Tense Usage: Example: "grow faster than males both in body length and weight creating considerable differences between sexes in which females can be double the size of males at similar developmental stages".Suggested correction: This sentence should maintain consistent tense usage.For instance, "females grow faster than males in both body length and weight, creating considerable differences between the sexes.At similar developmental stages, females can be double the size of males."Reply: Changed as suggested by reviewer.Lines76-78.

Awkward Phrasing:
Example: "About 11.08% and 11.55% of the assembled male and female genome was annotated and identified as repetitive elements".Suggested correction: The phrase could be more concise and clear.For example, "Approximately 11.08% of the male genome and 11.55% of the female genome were annotated as repetitive elements."Reply: Changed as suggested by reviewer.Line 173 to 175.Punctuation Errors: Example: "Using the male genome as reference we employed two different strategies to identify the sex-specific region(s) in cobia".Suggested correction: A comma is needed after 'reference' for better readability.Reply: Thanks for the suggestion.We found there is already a comma after 'reference' in this sentence.Line 625.
Example: "the latter observation (17% intersex) being attributed to increased levels of endocrine disrupting compounds (EDC) from industrial and agricultural pollutants in local waterways".Suggested correction: This sentence could be restructured for clarity.For instance, "The observation of 17% intersex individuals is attributed to increased levels of endocrine disrupting compounds (EDC) from industrial and agricultural pollutants in local waterways."Reply: Changed as suggested by reviewer.Line 442.

Sentence Fragment:
Example: "Consequently it's improbable that these regions play an important role in sex determination".Suggested correction: This fragment could be integrated into a complete sentence for better flow."Consequently, we conclude that it is improbable these regions play an important role in sex determination."Reply: Changed as suggested by reviewer.Lines 239 to 240.
Reviewer #3: In this manuscript, Shen and colleagues reported the male and female genomic resources of cobia Rachycentron canadum, and narrowed down potential male-specific genomic region via resequencing male and female individuals.The authors also validated the sex-specific DNA markers with individuals coming from other populations.The introduction is adequate, and overall the manuscript is also well written.I have two minor comments for the authors to consider to improve the manuscript.
1) Title: The current title is "Genomic investigation of the cobia (Rachycentron canadum) reveals putative male heterogametic regions on chromosome 18".My understanding is that there were previous studies working on the karyotypes of Rachycentron canadum based on shape of metaphase chromosomes.Should I get it correctly, the authors assigned the number of chromosomes based on descending order of sizes (Table S4).Thus chromosome 18 named in previous karyotype study and the chromosome 18 named in this study can not be judged as if they are the same or not.In this case, "on chromosome 18" should be removed from the title.Reply: The title has now been revised to: "Exploring the cobia (Rachycentron canadum) genome: Unveiling putative male heterogametic regions and identification of sex-specific markers".In addition, we have incorporated pertinent information into the figure legend of Figure 1.Line 180 to 182 "In the male genome, chromosome numbering is organized in descending order according to chromosome assembly size.Whereas the female genome's chromosome numbering follows a one-to-one syntenic relationship with the male genome."2) GWAS and Fst analyses -The authors also detected potential association signals of GWAS on three other chromosomes (MChr4, 5,17).The authors wrote in line 228-231 as follows: "a more detailed examination revealed that the SNP genotypes within these regions did not consistently exhibit heterozygosity in males and homozygosity in females.Furthermore, the resequencing data showed comparable coverage of these regions in both males and females".I think the authors should at least: i) show all these results in the supplementary information, and ii) at least write in the main text what protein coding genes are included in the genomic regions.Reply: We sincerely appreciate the valuable suggestions provided by the reviewer.Within the Supplementary Table file (in Excel format), we have incorporated Supplementary Table S12 (GWAS-detected suggestive signals of sex association at Chr4, Chr5, and Chr17).Additionally, we have also included the text 'Supplementary Table S12' in the main paper, line 231-233: 'a more detailed examination revealed that the SNP genotypes within these regions did not consistently exhibit heterozygosity in males and homozygosity in females (Supplementary Table S12).'Consequently, the original Supplementary Table S12 and Table S13 have been re-designated as Table S13 and Table S14, respectively, in both the Supplementary Tables (Excel files) and the main paper.Regarding the request for supplementary information pertaining to the statement 'Furthermore, the resequencing data showed comparable coverage of these regions in both males and females', we regret to inform that the original "bam files" utilized for calculating coverage of each sample have been removed due to prolonged non-usage, as all analyses were conducted between 2021 and 2022.Should the reviewer deem additional proof necessary, we are open to removing the sentence accordingly.Furthermore, as requested by the reviewer, we have elaborated in the main paper on the inclusion of protein-coding genes within the three genomic areas (MChr4, MChr5, and MChr17).Lines 235 to 240, "Additionally, three genes were identified on MChr4: mdn1 (Midasin AAA ATPase 1), trm6 (tRNA Methyltransferase 6 Non-Catalytic Subunit), and fermt1 (FERM Domain Containing Kindlin 1), while no genes were detected on Chr5.On Chr17, ostm1 (Osteoclastogenesis Associated Transmembrane Protein 1) was identified.However, these genes have not previously been reported to have a functional role in sexual development and function.

Introduction
Cobia (Rachycentron canadum) is a large migratory pelagic fish with geographic distribution in tropical and subtropical waters worldwide with the exception of the eastern Pacific Ocean [1].It is a promising marine fish species with great aquaculture potential due to its desirable traits, such as excellent quality fillets, easy adaptation to captivity, high survival rates, tolerance to variations in temperature and salinity, and high growth rate [2].The species has been farmed in many countries around the world including China, Taiwan, and Hong Kong, and more recently expanding to Australia, Vietnam, and the American continent (USA, Brazil, Panama, Belize etc.) [3][4][5].Cobia exhibits a strong sexually dimorphic growth [6][7][8].Females grow faster than males in both body length and weight, creating considerable differences between sexes.At similar developmental stages, females can be double the size of males [1].
In light of this, it is widely acknowledged that monosex female breeding through artificial sex control can significantly boost cobia aquaculture yields [2].The morphological characteristics (i.e., secondary sexual traits) used for sexing are usually only observed after sexual maturation, and thus are not useful for sexing juvenile fish.Nevertheless, it is often useful to know the sex of juveniles, e.g. in aquaculture breeding programs.Relying solely on cobia morphology is not enough to distinguish their sex at any developmental stage, including postsexual maturity.Furthermore, a reliable approach for distinguishing cobia's genotypic sex has yet to be established.
The even representation of males and females within cobia populations suggests that there is a genetic system (i.e., a master gene) driving sex determination, and the species is considered gonochoristic [1].Rare occurrences of intersex individuals have been reported in India [28] and Australia [3], with the latter supposedly attributed to the presence of endocrine disrupting compounds in the water.Unfortunately, limited knowledge is available on the molecular mechanisms of sex determination and differentiation in this species.To date, there have been no reported sex chromosomes, sex-determining regions, or sex determination genes in cobia.Furthermore, cytologically there are no distinguishable sex chromosomes observed between genders, as male and female cobia show the same diploid number (2n = 48) and the same karyotype morphology [2,29].Hence, the lack of reliable genotypic and phenotypic approaches for distinguishing the sex of cobia presents a significant hurdle for practitioners seeking to optimize broodstock management, conduct molecular selective breeding, as well as advance the conservation of the species.Consequently, it is crucial to explore the genetic underprinnings of sex determination and develop molecular markers that permit non-invasive and early sexing of cobia individuals.
Cobia is the extant monotypic member of family Rachycentridae, order Carangiformes, which consists of six families.Three of these families (i.e., Rachycentridae, Coryphaenidae and Echeneidae) are within the super family Echeneoidea that comprise a monophyletic grouping [30,31].R. canadum was assumed to be closely related (sister groups) to the remoras (E. naucrates), within the family Echeneidae, based on the morphology (form, colour and fin shape) of juveniles [31].However, osteological examinations revealed a greater likelihood of sister groups between R. canadum and Coryphaena based on the larval morphology [31].In addition, a phylogenetic analysis of 138 putatively informative characters of eleven species (including R. canadum) resulted in a single most parsimonious tree and showed that Rachycentridae is the sister-group to Echeneidae [32].While phylogenetics of Carangoid based on the complete mitochondrial DNA supported that the relationship between R.
canadum and mahi-mahi (Coryphaena hippurus), within family Coryphaenidae, was the closest [33].Whereas these studies have shed light on R. canadum phylogeny in relation to other clades, the question of whether it is more closely related to Coryphaenidae or Echeneidae still remains controversial.Therefore, further studies are required to elucidate the phylogenetic relationships of R. canadum within the order Carangiformes and understand its evolutionary history.
Genomic resources for cobia are currently extremely limited, hindering a better understanding of the genetic basis of sex determination and differentiation, as well as the molecular mechanisms of remarkable sexual dimorphisms in this unique fish species.
Therefore, the goals of this study were to: i) assemble the first male and female chromosomelevel reference genome for cobia; ii) identify candidate sex-linked genomic regions, putative sex-determining genes, as well as develop affordable and rapid male-specific DNA markers to determine the genetic sex of cobia; and iii) elucidate the phylogenetic relationship between cobia and other teleosts via genome synteny.

Assembly and annotation of chromosome-level male and female cobia genomes
We sequenced one male and one female cobia using single tube long fragment read (stLFR) and Hi-C technologies, each sex with over 285-fold genome coverage (Supplementary Tables S1,S2).De novo genome assembly was performed on 87.07 Gb and 78.12 Gb of clean stLFR reads separately generated for male and female, respectively (Supplementary Table S1).A 586.23 Mb of the male genome was assembled with a contig/scaffold N50 of 86.0 kb/10.3Mb (Supplementary Table S3), which is close to the 585.72 Mb estimate from k-mer analysis (Supplementary Fig. S1a).The genome assembly size for the female was 583.56 Mb (accounting for 99.2% of the k-mer estimated 588.46 Mb) with a contig/scaffold N50 of 79.9 kb/6.3Mb (Supplementary Table S3 and Supplementary Fig. S1b).Detailed information on the estimation of the genome size based on K-mer analysis is shown in Supplementary File Note 1.To further improve the genome assembly and anchor the scaffold sequences to chromosomes, we generated 81.5 Gb and 103.1 Gb Hi-C data for the male and female, respectively (Supplementary Table S2).By incorporating the Hi-C data, 586.51 Mb of the male genome was assembled with a scaffold N50 of 24.3 Mb, whereas the female genome assembly size was 583.88 Mb with scaffold N50 of 22.5 Mb (Supplementary Table S3).A total of 563.06 Mb (96.00% of the assembly) of the male and 537.27Mb (92.02% of the assembly) of the female genome sequence were ordered and oriented into 24 pseudo-chromosomes, respectively (Fig. 1a, Supplementary Fig. S2 and Table S4).
This outcome was consistent with the previous report on cobia karyotype (2n = 48) [29].All 24 chromosomes of the male and female genomes showed a clear one-to-one syntenic relationship (Fig. 1a and Supplementary Table S4).The quality of the two genome assemblies was assessed in two aspects: a) complete and single copy BUSCO scores of 94.2% (male) and 93.8% (female) (Supplementary Table S5); and b) an average of 96.45% and 97.88% RNASeq reads from gonadal tissues of cobia [8] could be mapped to the male and female genome assemblies, respectively.These results indicate that the assembled genomes were high quality.
A total of 21,604 and 21,688 protein-coding genes were separately annotated in the male and female genome assembly (Supplementary Table S3), and over 99% of them were annotated by functional database (Supplementary Table S6).The BUSCO evaluation of the protein sequences identified 93.1% and 92.6% of complete single copy genes for the male and female genome assembly, respectively (Supplementary Table S5).Approximately 11.08% of the male genome and 11.55% of the female genome were annotated as repetitive elements (Supplementary Tables S7, S8).We also identified 1,304 and 1,289 non-coding RNAs, with a total length of 116.9 kb and 117.0 kb in the male and female genomes, respectively (Supplementary Table S9).

Phylogenetic construction and evolution analysis reveals R. canadum and E. naucrates as sister groups
To investigate the evolutionary relationship of cobia (R. canadum) and related teleosts, a phylogenetic tree was constructed using 572 single-copy orthologous genes of cobia and nine other fish species (Supplementary Fig. S3 (Johnson, 1984) and confirmed that R. canadum and E. naucrates were sister groups (Fig. S5).From the estimates of divergence time, the ancestor of R. canadum separated from the ancestor of E.
naucrates approximately 51.4 million years ago (Mya).The ancestor of R. canadum and E.
naucrates separated from the ancestor of S. lalandi and T. ovatus approximately 84.0 Mya (Fig. 1b).In addition, the 24 pseudo-chromosomes of cobia had a clear one-to-one relationship to E. naucrates (Supplementary Fig. S6a), while seven chromosomes (6, 7, 10, 13, 14, 15 and 17) of R. canadum were observed to have a hit to two or three chromosomes of T.
ovatus (Supplementary Fig. S6b).Unfortunately, the chromosomal level genome of S. lalandi was not available, so no syntenic relationship was explored between R. canadum and S. lalandi.

Characterization of sex-specific regions in cobia
To locate the sex-specific genomic region(s) of cobia, a total of 2,681 Gb of filtered whole genome resequencing (WGRS) data were generated from 91 individuals (31 males and 60 females), with an average of ~49-fold depth per sample (Supplementary Table S10).Using the male genome as reference, an average mapping rate of 99.0% per sample was obtained (Supplementary Table S10).In total, 551,838 filtered single nucleotide polymorphisms (SNPs) were detected.The genome-wide association analysis (GWAS) using the male genome as reference revealed a single peak (-log p-values of up to 244.37) with 162 SNPs significantly associated with sex, spanning over a region of ~4.04 Mb (559.54 kb to 4.59 Mb) on male chromosome 18 (MChr18) (Fig. 2a, 2b and Supplementary Table S11).Most importantly, the 162 strongly sex-associated SNPs showed the same pattern where all thirty-one males were heterozygous, but homozygous for all sixty females (Supplementary Table S11).These results hinted at a putative male heterogametic or potential XX/XY model of sex determination, with a fully sex-linked region on MChr18.In addition, the principal component analysis (Fig. 2c) and a neighbour-joining tree (Fig. 2d) calculated using the SNPs from MChr18 showed that male and female individuals clustered into two distinct groups.Moreover, the relative component of genetic differentiation (estimated as Fst) between males and females further confirmed the region detected by GWAS (Fig. 2e).Hence, both the GWAS and Fst scan, which takes genetic structure into consideration, consistently identified a peak genomic region on MChr18, showing the highest probability as a sex-associated region in cobia.While GWAS indicated potential association signals in three other genomic areas (MChr4, MChr5, and MChr17) (Fig. 2a), a more detailed examination revealed that the SNP genotypes within these regions did not consistently exhibit heterozygosity in males and homozygosity in females (Supplementary Table S12).Furthermore, the resequencing data showed comparable coverage of these regions in both males and females.Additionally, three genes were identified on MChr4: mdn1 (Midasin AAA ATPase 1), trm6 (tRNA Methyltransferase 6 Non-Catalytic Subunit), and fermt1 (FERM Domain Containing Kindlin 1), while no genes were detected on Chr5.On Chr17, ostm1 (Osteoclastogenesis Associated Transmembrane Protein 1) was identified.However, these genes have not previously been reported to have a functional role in sexual development and function.Consequently, we conclude that it is improbable these regions play an important role in sex determination.were detected from its re-annotation (Supplementary Table S13).Most importantly, all the gaps presenting within the sex-linked region and its flanking regions in the original MChr18 were fully filled.GWAS analysis was carried out on the newly assembled MChr18, which detected 232 SNPs significantly associated with sex in a single peak.Consistently, all 232 SNPs showed that all 31 males were heterozygous, but all 60 females were homozygous (Supplementary Table S14).Interestingly, further sequencing depth analysis revealed that three short male-specific regions of Y1 (400 bp; 3,187,350 to 3187750 bp), Y2 (1,100 bp; 3,195,150 to 3,196,250 bp) and Y3 (1,000 bp; 3,197,050 to 3,198,050 bp) within a continuous region of 10.7 kb were discovered within the sex region, which showed no WGRS reads mapped from 60 females (corresponding depth of zero for females), while with mean depth of 0.5 (haploid copy specific to males) in 31 males (Fig. 3a, 3b), suggesting that the Y1, Y2 and Y3 are putative male determining regions that could contain the candidate master sexdetermining genes.In addition, the θπ value analysis showed that the divergence mainly came from the male group (Fig. 3c).

The cephx1y and ctcf24y: the putative drivers of cobia sex determination
We further scanned the 10.7 kb sex-associated region on MChr18.A short insertion (540 bp within the 6 th intron region) in a functionally annotated gene of epoxide hydrolase 1_ephx1 (3,184,084 to 3,188,235 bp) was identified in the male-specific region Y1.There were no genes detected in region Y2 and Y3 (Fig. 3a).However, another gene, transcription factor 24_tcf24 (3,190,353 to 3,193,193 bp), was detected between Y1 and Y2 (Fig. 3a).These two genes, especially the ephx1, were considered of high interest for male function in cobia.In addition, both genes were also found in the homologous female chromosome of FChr18 (19.29 Mb).Alignment of ephx1 and tcf24 genomic sequences in MChr18 and FChr18 revealed a high nucleotide identity of 96.6% and 95.9%, respectively.InDel (insertion-deletion) variants with variable length and SNPs also existed in both gene (coding regions and introns) comparison groups (Supplementary File: Genomic DNA sequence alignment of ephx1 and tcf24).The two genes on MChr18 were termed cobia ephx1y (cephx1y) and cobia tcf24y (ctcf24y), and cephx1x and ctcf24x for Fchr18.As nucleotide sequence divergence impacts protein sequence, gene structure predictions were performed for both genes.The results showed that cephx1x spans about 3.78 kb and consists of seven introns and eight exons (Fig. 4a1) encoding 455 amino acids (Fig. 4b1).However, only six introns and seven exons were detected for cephx1y with a total length of 4.15 kb, which showed one exon (VII) absent (Fig. 4a1) encoding 416 amino acid residues (Fig. 4b1).
The nucleotide identity between exon sequences of cephx1x and cephx1y ranged from 95.1% to 98.3% with an average of 97.3% (Fig. 4a1).A closer look at cephx1x and cephx1y revealed that the largest sequence differences were two InDels of 165-bp of X-specific insertion and 540-bp of Y-specific insertion in the non-coding regions (Fig. 4a1), while the remaining were randomly distributed SNPs and short InDels (Supplementary file).It is noteworthy that the Y1 region was in fact the male-specific fragment specifically inserted in the sixth intron of the cephx1y (Fig. 4a1 and Supplementary file).In terms of tcf24, the ctcf24y contained the same number of exons (four) and introns (three) as ctcf24x (Fig. 4a2), ctcf24y spans 2,840 bp and encodes 202 amino acids, while ctcf24x has 3,024 bp with a translated protein product of 204 amino acids (Fig. 4b2).The observed sequence identity in exons was 96.9 to 98.8%.There was also a large 220-bp X-specific insertion in the third intron of ctcf24x, and several small InDels and SNPs were also detected between them (Fig. 4b1 and Supplementary file).Moreover, we built a structural model for both genes.The Ephx1 is a protein coding gene, the Cephx1y protein folds is essentially identical to Cephx1x (Cα rootmean-square deviation of 1.108Å).While the Cephx1x had an extra 41-amnio-acid helix-turnhelix domain (missing in Cephx1y), which plays an important role in the stability of the protein (Fig. 4c1).In addition, the amino acid alignment of the two ephx1 genes of cobia and other fish like E. naucrates, E. lanceolatus and Seriola dumerili revealed that the loss of the helix-turn-helix domain existed only in the Cephx1y (Supplementary File).Both cobia tcf24 counterparts lack a fixed or ordered three-dimensional structure, and a total of nine amino acid differences at six sites were detected between Ctcf24x and Ctcf24y (Fig. 4c2).
A further investigation of the expression pattern of ephx1 and tcf24 by examining the cobia gonadal transcriptome [8] showed that both cephx1y and ctcf24y were significantly differentially expressed between males and females (Fig. 4d).The ctcf24y was more highly expressed in testis (FPKM: 7.27) than ovaries (FPKM: 0.28), with the Log2 (Fold Change) of ovary/testis of -4.8.In addition, the expression of cephx1y was observed in all five testis samples (1.5 ≤ FPKMs ≤ 6.3), but only in one of five ovary samples (0 ≤ FPKMs ≤ 0.02), with the log2FC = -9.5;indicating this gene was nearly exclusively expressed in male cobia (Fig. 4d).
The cephx1x and ctcf24x showed no significant differential expression between testes and ovaries.In addition, PPI network analysis showed that the cephx1y interacts directly and significantly with three cyp1 genes: cyp1a1, cyp1b1 and cyp1d1 (Fig. 4e).Testis-specific expression was observed for cyp1a1, while cyp1b1 was overexpressed in the ovary, and cyp1d1 was absent [8].The ephx1 gene together with cyp1a1 and cyp1b1 are involved in the metabolism of xenobiotics through the cytochrome P450 pathway (Supplementary Fig. S7).
In addition, the cephx1y also interacts indirectly with hsd17b1 and three other CYP components: cyp19a1a (gonadal aromatase), cyp19a1b (brain aromatase) and cyp3a65 (predicted to enable steroid hydroxylase activity in zebrafish).

Sex specific DNA markers development and population specificity validation
To develop sex-specific markers, two forward (F) and two reverse (R) primers were designed to target the male-specific fragment insertion region within the sixth intron of cephx1y.The first set of primers, cephx1_1 (F1-R1), was designed to amplify a 404 bp region where there was an absolute deletion of the fragment for cephx1x in the region of FChr18, therefore the PCR would only amplify the "Y" sequence.The second set of primers, cephx1_2 (F2-R2), had the forward primer located in the male-specific region of cephx1y, while the reverse primer (R2) was situated in a region common to both cephx1y and cephx1x, resulting in a predicted PCR product size of 359 bp.Following the PCR validation of eight male and eleven female cobia individuals from Panama, both pairs of primers successfully amplified a male-specific product in all the unambiguously phenotypically sexed males (Fig. 4f).
Furthermore, distinct patterns were observed for the two markers during the analysis of cross-population validation.Similarly to Panama population, both cephx1_1 and cephx1_2 successfully amplified a PCR product from all males and was absent from females within the population of Brazil.Sanger sequencing of the PCR products, using both sets primers, confirmed the accuracy of the target sequences in both Panama and Brazilian populations.
However, cephx1_1 did not show any amplification in both males and females in Japan and Australia populations, while cephx1_2 successfully amplified a shorter 180 bp product exclusively in males and females showed no amplification.In contrast, a longer 359 bp product was amplified from male DNA originating from the Brazilian and Panama populations (Fig. 4f).Subsequent Sanger sequencing of PCR products confirmed the presence of the 180 bp fragment in Japan and Australia populations.In addition, the failure of cephx1_1 to amplify in Japan and Australia populations is due to its reverse primer being situated within the 179 bp (359 bp -180 bp) missing region.

Discussion
The absence of a chromosome-level reference genome for cobia posed a significant challenge for in depth genomic analysis for the species, such as the investigation on its phylogenetic relationship with other teleosts and sex determination mechanisms.In the present study, we successfully obtained two high-quality genomes for both sexes of cobia using a combined strategy involving stLFR and Hi-C technologies.The high BUSCO score of 94.2% for the male and 93.8% for the female indicates the completeness of the two genome assemblies.
To our knowledge, this represents the first annotated chromosome-level reference genome of the species.These resources will provide researchers with opportunities to explore the molecular mechanisms controlling cobia's sex-determination system, and other economically important traits through genomic selection for faster growth, disease resistance and highquality fillets.Moreover, it may also allow the development of further in-depth studies to better understand the biology of the species, such as how this large pelagic migrant inhabits all tropical and subtropical oceans of the globe and to inform more sustainable fisheries management practices.
The study of sex determination systems in teleosts can be technically challenging as most of them possess undifferentiated sex chromosomes and have various complex and diverse mechanisms for determining sex [34,35].Genome-wide association studies (GWAS) have gained popularity in exploring sex determination mechanisms, enabling the identification of sex-linked markers, sex determination loci and candidate genes [36][37][38][39].In this study, we conducted GWAS to investigate the genetic basis of sex determination in cobia and identified a sex-associated region on MChr18, with 232 most significantly sex-linked SNPs that presented as heterozygote genotypes in all the males and complete homozygosity in all the females of captive Panama cobia population.The high density of sex-specific SNPs was a feature of the putative sex determination locus, which has also been observed in sex determination studies of other aquatic species [36,40].Moreover, this male-specific heterozygosity pattern suggested that the cobia may possess a putative male heterogametic sex determination system (XX/XY) which would be consistent with being a gonochoristic species [1].In addition, the selection signatures of Fixation index Fst and sequencing depth analysis served to further strengthen the evidence for the potential sex determination locus identified through GWAS.Taken together, the integration of GWAS, Fst scanning and sequencing coverage analysis identified a strongly sex-linked region and provided the evidence that the MChr18 is the potential undifferentiated homologue containing sex-specific loci, which is in agreement with the previous karyotype analysis where no morphologically distinct sex chromosomes for cobia were found [29].The combination of these strategies has proven to be an effective approach for investigating the putative sex determination mechanism in cobia, as well as in other species [36,41,42].Furthermore, it is noteworthy that the sex associated region on MChr18 identified preliminary in the male genome assembly by combining the stLFR and Hi-C reads contained several gaps with variable length within the region and the flanking regions.The highly repeated content of this region complicated the assembly of the region.By further adding the PacBio HiFi reads, we obtained a small gap-free sex-linked region of 10.7 kb.This highlights the power of using long sequencing reads to assemble highly repetitive and complex genomic regions.
The small sex-associated region characterized on MChr18 contains two putative master sex-determining genes for cobia, ephx1 (a short male-specific fragment insertion within Y1) and tcf24 (between Y1 and Y2).Although ephx1 has not previously been reported as a master sex gene or linked to sex determination, it is known to regulate endogenous steroid metabolism (i.e., androgens and estrogens), suggesting a functional role in sexual development and function in mammals [43].A previous study in humans showed that upon treatment with an ephx1 inhibitor, a decrease in estradiol formation was seen in ovaries [44].
In mice, ephx1 is up-regulated in the embryo-containing oviduct and is thought to play a role in preimplantation embryo development [45].However, its reproductive function in fish remains poorly studied [46][47][48].Here, the cepxh1y was observed to be nearly exclusively expressed in the testes of adult fish by examining a recently published gonadal transcriptome of cobia [8].In addition, a small male-specific insertion was detected in the sixth intron of cephx1y, which was in fact the identified male-specific region of Y1.Moreover, the loss of exon VII (a 41-amnio-acid helix-turn-helix domain) was only observed in the cephx1Y when compared to that of cephx1x and the other fish species and mammals.All these findings suggest that cephx1y could be a potential sex-determining gene in cobia.In addition, the PPI network analysis showed that cephx1y exhibits direct or indirect interactions with six Cyp genes and hsd17b1.The Cyp genes, specifically P450 aromatase (cyp19a1a), is known to have a crucial function in the development of gonads in various fish species [49,50].The hsd17b1, a gene involved in steroidogenic pathway, has been recognized as a master sex-determining gene in yellowtail species [26], which belong to the same order (Carangiformes) as cobia.
The ephx1 gene encodes microsomal epoxide hydrolase (EPHX1), an enzyme known to be involved in the metabolism of xenobiotics and is thought to mediate functions including bioactivation and detoxification of environmental deleterious compounds [45,51].The occurrence of cobia intersex individuals has been reported in India [28] and Australia [3].The observation of 17% intersex individuals is attributed to increased levels of endocrine disrupting compounds (EDC) from industrial and agricultural pollutants in local waterways.
Interestingly, the EPHX1 enzyme has been reported to be involved in xenobiotic metabolism and regulates endogenous steroid metabolism [43].Therefore, it stands to reason that EDC exposure could have an effect on EPHX1 catalytic activity, disrupting its functional associations with the cytochrome P450 family which mediate sex determination and differentiation pathways and potentially cause aberrations in gonadal development of cobia, more so given that cephx1y is the only male-specific gene detected in the species.Fish exhibit a wide variety of sex-determining genes [(for review see [17]], and more 'newcomers' with no previously known role in sex determination have also been discovered in recent years, such as Paics in blue tilapia (Oreochromis aureus) [52] and bcar1 in channel catfish (Ictalurus punctatus) [27].The present study has identified cephx1y as a novel potential sexdetermination gene in cobia, offering new knowledge on the molecular mechanisms involved in teleost sex determination.Further functional experiments, such as genome-editing, are necessary to confirm and further explore these findings, as well as to clarify the complete mechanism by which cephx1y might modulate aromatase activity or other steps of the steroidogenic pathway in cobia and potentially other teleosts.
The transcription factor tcf24 was first described in humans in 2002, but its functions remain largely unknown [53].The only publication related to this factor in fish revealed that tcf24 is upregulated in the hindbrain of individually-housed three-spine stickleback (Gasterosteus aculeatus) as a molecular basis for social behaviour [54].The ctcf24y was significantly up-regulated in the testes by examining the data from the cobia gonadal transcriptome [8], suggesting a potential role in testicular differentiation of cobia.It's important to note that tcf24 has a paralog, tcf23 (also called OUT), which plays a role in mammalian reproduction.In humans, tcf23 is a newly identified decidual mediator of progesterone action [55].While in mice, it was expressed in adult reproductive tissues, e.g.uterus, ovaries and testes [56], indicating its potential role in male and female reproductive biology.Studies in fish have also shown that tcf23 was highly upregulated in the ovaries of coho salmon (Oncorhynchus kisutch) after treatment with 11-KT (11-Ketotestosterone).In rainbow trout, tcf23 was detected exclusively in the gonads of both sexes [57].To date, functional studies of both tcf24 and tcf23 on reproductive biology are still very limited, particularly in teleost fishes.Thus, further detailed functional characterization of tcf24 is required to understand its potential role in sex determination and differentiation in cobia.
Identifying reliable and universally applicable sex-linked markers in fish poses challenges due to the considerable variability in sex-determination genes and systems, even among closely related species and within populations of the same species [58][59][60].The current research successfully developed and validated two male-specific PCR-based markers (overlapping amplicons) targeting cephx1 for cobia population of Panama.These two markers were validated and shown to be amplifiable only in males among individuals from Brazil.In the Japan and Australia populations, primers for cephx1_1 did not amplify, while primers for cephx1_2 amplified a shorter 180 bp products in males compared to the 359 bp found in the Brazil and Panama populations, revealing a shorter cephx1y intron 6 in the Asian and Australian populations when compared with the two populations from the Americas.
Nevertheless, the absence of cephx1y in cobia females in far distant populations across the globe indicates a conserved role of cephx1y as a key putative sex-determining gene for the species.The development of this simple sex-specific PCR tool has the potential to significantly improve artificial fertilization and precise breeding in the cobia aquaculture industry, and ultimately leading to the development of monosex populations and increased productivity.
Additionally, it aids non-lethal sampling and improves animal welfare in breeding programs.
The placement of the cobia (R. canadum) within the Carangiformes order and its status as the only member of the Rachycentridae family are well established.However, there have been divergent findings regarding its phylogenetic relationship to other species, particularly whether it is more closely related to the Coryphaenidae or Echeneidae [31-33, 61, 62].In the current study, comparative genome analysis between R. canadum and one Echeneidae (E. to be further investigated when dolphinfish genomes become available.

Conclusions
We have successfully assembled and annotated high quality chromosome-level reference genomes for male and female cobia, which will provide a valuable resource for future investigations into the population structure, evolutionary history, fisheries management, and conservation of cobia and other Carangiformes species.Furthermore, the findings of this study suggests that cobia may harbor a putative male heterogametic (XX/XY) genetic sex determination system, with two genes cephx1y and ctcf24y as potential putative main drivers of cobia sex determination.Notably, cephx1y could represent a putative novel sexdetermining gene, which further supports the rapid evolution of sex-determining mechanisms in teleost fish.Moreover, our development of a practical PCR-based method for identifying genetic sex in cobia can assist in the breeding monosex female populations in commercial farming of the species.

Experiential fish and sample collection
The majority of cobia individuals used in this study were obtained from Open Blue Sea Farms, the Republic of Panama.One male and one female adult fish at two-years-old were sampled for the whole genome de novo sequencing and assembly.In addition, a total of ninety-one fin clips from adult fish (thirty-one males and sixty females) were sampled for whole genome resequencing.For the development and validation of sex-specific DNA markers, the ovary and testis tissues were dissected from five male and five female fish, and nine fin clips (from the 91 referenced above) were chosen from three male and six female fish.Moreover, to validate the population specificity of the sex-specific DNA markers, fin clips from adult fish were obtained from three additional cultured populations in Japan (3 males and 5 females), Brazil (8 males and 5 females) and Australia (8 males and 10 females).Sex of fish individuals was determined through cannulation or gonadal observations.

Genome sequencing
High quality and molecular weight genomic DNA was extracted from fin clips of male and female cobia with QIAamp DNA purification kit (Qiagen) in accordance with the manufacturer's protocol.Paired-end single tube long fragment reads (stLFR) libraries [63] and Hi-C libraries were constructed using published protocols available via protocols.io[64] , and sequenced on the BGISEQ-500 platform (BGI, Shenzhen, China; RRID:SCR_017979) [65], yielding 100 bp paired-end (PE) reads.Barcodes were first split from stLFR raw reads and subsequently filtered by Soapfilter v2.2 (parameter: -y -p -M 2 -f -1 -Q 10) to generate high quality sequences.The genome sizes of the male and female cobia were estimated based on k-mer analysis (k=17) using Jellyfish v2.2.6 [66] and Genome Scope v1.0 [67].Genome size was estimated with the formula genome size (Mb) = K-mer number/K-mer depth.For PacBio sequencing, high molecular weight gDNA from testis was extracted using a standard phenol/chloroform method.The testis was selected because it ensures certainty regarding the sex and it yielded high quality DNA.The integrity of the extracted DNA was assessed by 0.75% agarose gel electrophoresis, and the concentration was quantified by a Qubit 4 Fluorometer (Thermo Fisher Scientific, Inc., USA).Ten micrograms of DNA was then used to construct the library for PacBio SMRT Sequencing using the SMRTbell express template prep kit (PacBio, Menlo Park, CA, USA).The library was sequenced using the PacBio Sequel II System (RRID:SCR_017990) with HiFi sequencing modes .

Genome assembly
The high quality paired-end single tube long fragment reads (stLFR), with read length of 100 bp, were used for initial genome assembly by employing the 10X Genomics software supernova [67].First, the format of high quality reads was transformed to 10X Genomics format, and then the male and female cobia genomes were separately assembled with Supernova v2.1.1.To further improve the quality of the assembly, Gapcloser (v1.12;RRID:SCR_015026) [68] was used with default parameters to fill gaps.Furthermore, Purge_haplotigs (RRID:SCR_017616) [69] was used to reduce redundancy of the initial assembly.The uniformity and completeness of the cobia male and female genome assemblies were evaluated by the read mapping rate as well as BUSCO (Benchmarking Universal Single-Copy Orthologs) [70].Finally, chromosome level assemblies were constructed using Hi-C data.
Valid reads (the reads with contact information after processing of HiC-Pro pipeline, including read alignment, detection and filtering of valid interaction products, binning and contact map normalization) were used for assignment of contigs or scaffolds to chromosomes.Juicer v1.5 (RRID:SCR_017226) [72] and 3D-DNA (3D de novo assembly) [73] was used to anchor the male and female cobia genome assembly onto pseudo-chromosomes.In order to enhance the continuity of the sex-associated region, we further conducted genome assembly of a male cobia using PacBio reads.The obtained HIFI long reads were fed to hifiasm (v0.14.1-r314;RRID:SCR_021069) with the default parameters, and the primary assembly result p ctg.gfa file was converted into fasta format with in-house scripts.
Using the male genome as reference, we employed two different strategies to identify the sex-specific region(s) in cobia.A genome-wide association study (GWAS) was first performed using EMMAX [92], mixed linear model, to test whether any of the SNPs identified were significantly associated with sex.The first 10 PCs of PCA from plink (v1.90b6.12)were used as concomitant variables at the same time.Secondly, we calculated depth of each site for all ninety-one samples using samtools-depth module (v-1.9 ) [93].The average depth distribution analysis between the male and female group (bin 50 bp, normalized per sequencing depth of each sample) was also employed by exploiting the difference in sex chromosome ploidy between males and females.The Fst (Wright's fixation index) between male and female groups and θπ (nucleotide diversity) of each group were calculated by vcftools (v0.1.13[94]. A variant density approach was performed by searching for differences in SNP density between males and females.The protein-protein interaction (PPI) network prediction (http://cn.string-db.org)[95] was adopted for the identified putative master sex-determining gene for cobia.

Figure 1 :
Figure 1: Overview of male and female cobia genome features.(a) Landscape of the 24 assembled cobia chromosomes.In the male genome, chromosome numbering is organized in descending order according to chromosome assembly size.Whereas the female genome's chromosome numbering follows a one-to-one syntenic relationship with the male genome.From the outer to the inner: A) GC_content, B) Transposable element content density, C) Gene density, D) Chromosomes, E) Syntenic relationship of female (left of the circle) and male (right of the circle) chromosomes.(b) Phylogenetic tree of 10 vertebrate genomes constructed using 572 single copy orthologous genes.The numbers (blue) on the branches represent the estimated divergence time in million years ago (Mya).Time span in brackets were the 95% confidence interval of divergence time, and red circles indicate the calibration time from fossil.All nodes had support values of 100%.

Figure 2 :
Figure 2: Genome-wide distribution of SNPs from 31 males and 60 females.(a) Manhattan plot showing -log10 p-value of each SNP from the GWAS investigating sex-associated regions on the cobia genome.The horizontal line indicates the genome-wide significance threshold -log10(P)= 7.7.(b) The SNP distribution on Chr18.The 162 SNPs significantly associated with sex spanning over a region of ~4.04 Mb (559.54 kb to 4.59 Mb).(c) Principal component analysis of ninety-one individuals using SNPs.(d) Phylogenetic tree showing relationships of females (blue) and males (orange).(e) Genome-wide scan of fixation index (Fst) matching the results from the GWAS.The genome-wide difference of sequencing depth between males and females was also analysed to identify the sex-specific region(s) in cobia.By investigating the mean depth (sites depth/average depth), the sex-linked region identified above was further narrowed down on MChr18.Unfortunately, an unacceptable number of gaps with variable length were observed within the region and the flanking regions.To improve the contiguity of this sex-associated

Figure 3 :
Figure 3: (a) Alignments of the stLFR reads from male and female individuals to the sex-associated regions.(b) Average depth normalized per group.For both figure (a) and (b), the letters of Y1, Y2 and Y3 stand for the distinct regions between males and females.(c) Genetic diversity of sex-determining region.Blue line indicates the female group, and yellow line represents the male group.

Figure 4 :
Figure 4: (a) Schematic representation gene structure of cephx1y and cephx1x, ctcf24y and ctcf24x in cobia.The red boxes indicate sex-specific insertions.(b) Amino acid sequence alignment of cephx1y and cephx1x, as well as ctcf24y and ctcf24x of cobia.(c) Structural model of Cobia ephx1 (c1_Blue colour) and tcf24 (c2_Green colour).The red color represents an extra helix-turn-helix domain in cephx1; the purple color indicates amino acid differences between the paralogs.N, N terminus; C, C terminus.(d) The expression profile of ephx1 and tcf24 in the testes and ovaries based on gonadal transcriptomic analysis."*"and "**" indicate p < 0.01 and p < 0.001, respectively.(e) The interaction of cephx1y with CYP components (green and yellow shaded) and other proteins based on the STRING PPI network.Nodes naucrates) as well as two Carangidaes (T.ovatus and S. lalandi), showed that R. canadum and E. naucrates were sister groups, and the ancestor of R. canadum separated from the ancestor of E. naucrates approximately 51.4 million years ago.This investigation represents the first of its kind and provides insights into the evolutionary relationship of R. canadum through comparative genomic and phylogenetic analysis.Unfortunately, genomes of the only two species within Coryphaenidae family, the mahi-mahi or common dolphinfish (C.hippurus) and the pompano dolphinfish (C.equiselis), are not yet available.Therefore, a more complete study of the evolution of the Rachycentridae genome (cobia as a single representative) needs